Abstract

We consider inference of the parameters of the diffusion term for Cox-Ingersoll-Ross
and similar processes with a power-type dependence of the diffusion coefficient on the
underlying process. We suggest some original pathwise estimates for this coefficient and
for the power index, based on an analysis of an auxiliary continuous-time complex-valued
process generated by the underlying real-valued process. These estimates do not rely on the
distribution of the underlying process or on a particular choice of the drift. Some numerical
experiments are used to illustrate the feasibility of the suggested method.

1 Introduction

In this paper, we consider inference of the diffusion term for Cox-Ingersoll-Ross and similar
processes with a power-type dependence of the diffusion coefficient on the underlying process.
These processes are important for applications; in particular, they are used for interest rate
models and for volatility models in finance; see, e.g., Heston (1993), Gibbons and Ramaswamy
(1993), Lewis (2000), Zhou (2001), Carr and Sun (2007), Andersen and Lund (1997), Gourieroux
and Monfort (2013), Fergusson and Platen (2015), Hin and Dokuchaev (2016), and the
bibliography therein. Estimation of the parameters for these models has been widely studied;
see, e.g., Gibbons and Ramaswamy (1993), Andersen and Lund (1997), Kessler (1997), Sørensen
(2000), Zhou (2001), Fan et al. (2003), Ait-Sahalia (1996), De Rossi (2010), Gourieroux and
Monfort (2013).

We readdress the problem of inference for these processes. We suggest a new method that
yields pathwise estimates of the diffusion coefficient and the power index, represented
as explicit functions defined on an auxiliary continuous-time complex-valued process generated
by the underlying real-valued process. An attractive feature of the method is that it requires
estimation of neither the parameters of the drift nor the distribution of the underlying process.
In particular, one does not need to know the shape of the likelihood function. In addition, our
method can handle models with a large number of parameters for the drift; therefore,
it covers cases where the Maximum Likelihood method is not feasible due to high
dimension. This is especially beneficial for financial applications, where the trend of the prices
is usually difficult to estimate since it is overshadowed by a relatively large volatility. Since
the drift is excluded from the analysis, our method does not lead to an estimate of the drift.
However, it could be a useful supplement to the existing more comprehensive methods such
as those described in Gibbons and Ramaswamy (1993), Andersen and Lund (1997), Sørensen (2000),
Zhou (2001), Fan et al. (2003), Ait-Sahalia (1996), De Rossi (2010), Gourieroux and Monfort
(2013), and Kessler (1997). These works estimate the parameters of the drift term; in contrast,
the method discussed in the present paper bypasses this task.

The feasibility and robustness of the suggested method are illustrated with some numerical
experiments.

2 The model 

Let $\theta \in \mathbb{R}$ and $T \in (\theta, +\infty)$ be given. We are also given a standard complete probability
space $(\Omega, \mathcal{F}, \mathbf{P})$ and a right-continuous filtration $\{\mathcal{F}_t\}_{t \ge \theta}$ of complete $\sigma$-algebras of events. In
addition, we are given a one-dimensional Wiener process $w(t)|_{t \in [\theta,T]}$; that is, a Wiener process
such that $w(\theta) = 0$, adapted to $\{\mathcal{F}_t\}$, and such that $\mathcal{F}_t$ is independent of $w(s) - w(q)$ if
$s > q \ge t \ge \theta$.

Consider a continuous-time one-dimensional random process $y(t)|_{t \ge \theta}$ such that $y(\theta) > 0$ and

$$dy(t) = f(y(\cdot),t)\,dt + \sigma(t)\,y(t)^{\gamma}\,dw(t), \quad t \in (\theta,T). \tag{2.1}$$

Here $\gamma \in [0,1]$, $\sigma(t)$ is a bounded $\mathcal{F}_t$-adapted process, $f(s,t) : C([\theta,T]) \times [\theta,T] \to \mathbb{R}$ is a
measurable function such that $f(s,t)$ is $\mathcal{F}_t$-adapted for any $s \in C([\theta,T])$, and such that $f(s_1,t) = f(s_2,t)$
if $(s_1 - s_2)|_{[\theta,t]} = 0$. In addition, we assume that, for any $\delta > 0$,
$|f(s_1,t) - f(s_2,t)| \le C_1 \|s_1 - s_2\|_{C([\theta,T])}$ and $|f(s,t)| \le C_2(\|s\|_{C([\theta,T])} + 1)$
for some constants $C_k = C_k(\delta) > 0$ a.s. (almost surely) for all $t \in [\theta,T]$ and all $s, s_1, s_2 \in C([\theta,T])$
such that $\inf_{t \in [\theta,T],\, k=1,2} s_k(t) \ge \delta$.

Under these assumptions, there exists a Markov time $\tau$ with respect to $\{\mathcal{F}_t\}$ with values in
$(\theta,T]$ such that there exists a unique almost surely continuous solution $y(s)|_{s \in [\theta,T]}$ such that
$\inf_{s \in [\theta,\tau]} y(s) > 0$.


Examples of applications in financial modelling 

The assumptions on the process $y$ allow it to be used for a variety of financial models. In particular,
the assumption on the drift coefficient $f$ allows a path-dependent evolution such as that
described by equations with delay; see some examples in Stoica (2005) and Luong and Dokuchaev
(2016).

The assumptions on the diffusion coefficient cover many important financial models.
In particular, the so-called Cox-Ingersoll-Ross process is used for interest rate models and
for the volatility of stock prices (Heston (1993)). This equation has the form

$$dy(t) = a[b - y(t)]\,dt + \sigma\,y(t)^{1/2}\,dw(t), \quad t > 0, \tag{2.2}$$

where $a > 0$, $b > 0$, and $\sigma > 0$ are some constants.

A more general model introduced in Chan et al. (1992),

$$dy(t) = a[b - y(t)]\,dt + \sigma\,y(t)^{\gamma}\,dw(t), \quad t > 0, \tag{2.3}$$

is called the Chan-Karolyi-Longstaff-Sanders (CKLS) model in the econometric literature; see, e.g.,
Iacus (2008). Equation (2.1) with $\gamma = 2/3$ is also used for volatility modelling; see, e.g.,
Carr and Sun (2007) and Lewis (2000).

3 The main result 

Up to the end of the paper, we assume that the conditions on $f$ and $\sigma$ formulated above hold. We
assume below that $\tau$ is a Markov time with respect to $\{\mathcal{F}_t\}$ such that $\tau \in (\theta,T]$ a.s. and that
$\inf_{s \in [\theta,\tau]} y(s) > 0$ a.s. In particular, one can select $\tau = T \wedge \inf\{s > \theta : y(s) \le M\}$ for any given
$M \in (0, y(\theta))$.

Our main tool for estimation of the pair $(\sigma, \gamma)$ is provided by the following theorem.

Theorem 3.1 For any $h \in [0,1]$,

$$\int_\theta^\tau y(s)^{2(\gamma-h)}\sigma(s)^2\,ds = 2\log|Y_h(\tau)| \quad \text{a.s.}, \tag{3.1}$$

where $Y_h(s)$ is a complex-valued process defined for $s \in [\theta,\tau]$ such that

$$dY_h(s) = iY_h(s)\,\frac{dy(s)}{y(s)^h}, \quad s \in (\theta,\tau), \qquad Y_h(\theta) = 1. \tag{3.2}$$

In (3.2), $i = \sqrt{-1}$ is the imaginary unit.

Corollary 3.1 (a) We have that

$$\int_\theta^\tau \sigma(s)^2\,ds = 2\log|Y_\gamma(\tau)| \quad \text{a.s.} \tag{3.3}$$

(b) If $\sigma(t) = \sigma$ is constant, then, for any $h \in [0,1]$,

$$\sigma^2 = 2\left(\int_\theta^\tau y(s)^{2(\gamma-h)}\,ds\right)^{-1}\log|Y_h(\tau)| \quad \text{a.s.} \tag{3.4}$$
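The identity in Corollary 3.1(a) can be checked numerically on a simulated path. The following Python sketch is an illustration only, not the implementation used in the paper: it assumes a CIR-type drift $a(b - y)$, an Euler-Maruyama grid on $[0,T]$, and a small positive floor that keeps the simulated path positive. The function names `simulate_path` and `log_abs_Y` are hypothetical.

```python
import math
import random


def simulate_path(y0, a, b, sigma, gamma, n, T=1.0, seed=0):
    """Euler-Maruyama path of dy = a(b - y)dt + sigma*y^gamma dw on [0, T].

    The CIR-type drift and the positive floor are assumptions of this sketch.
    """
    rng = random.Random(seed)
    dt = T / n
    path = [y0]
    for _ in range(n):
        prev = path[-1]
        noise = sigma * prev ** gamma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(max(prev + a * (b - prev) * dt + noise, 1e-8))
    return path


def log_abs_Y(path, h):
    """log|Y_h| at the end of the path, from the Euler product of (1 + i*eta)."""
    Y = complex(1.0, 0.0)
    for prev, cur in zip(path, path[1:]):
        eta = (cur - prev) / prev ** h  # increment of y scaled by y^h
        Y *= complex(1.0, eta)          # one Euler step of dY = i*Y*dy/y^h
    return math.log(abs(Y))
```

For a path simulated with constant $\sigma = 0.3$ and $\gamma = 1/2$ on $[0,1]$, the value `2.0 * log_abs_Y(path, 0.5)` should be close to $\sigma^2 T = 0.09$ when the grid is fine, up to the time discretization error.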


4 Applications of Theorem 3.1 to estimation of $(\gamma, \sigma)$


Up to the end of this paper, we assume that $\sigma(t) = \sigma$ is an unknown positive constant.

We present below estimates of $(\sigma, \gamma)$ based on available samples $\{y(t_k)\}$, where $t_k \in [\theta, \tau \wedge T]$,
such that $t_{k+1} = t_k + \delta$ for $k = m_0, m_0 + 1, \ldots, m - 1$, $\delta = (\tau \wedge T - \theta)/(m - m_0)$, $t_{m_0} = \theta$ and
$t_m = \tau \wedge T$. In this setting, $y(t_k) > 0$ for $k = m_0, \ldots, m$.

For $h \in [0,1]$, let

$$\eta_{h,k} = \frac{y(t_k) - y(t_{k-1})}{y(t_{k-1})^h}.$$

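Estimation of $\sigma$ under the assumption that $\gamma$ is known

Let us first suggest an estimate for $\sigma$ under the assumption that $\gamma$ is known.

Corollary 4.1 For any $h \in [0,1]$, the value $\sigma$ can be estimated as $\hat\sigma_{h,\gamma}$, where

$$\hat\sigma_{h,\gamma}^2 = \left(\delta \sum_{k=m_0+1}^{m} y(t_k)^{2(\gamma-h)}\right)^{-1} \sum_{k=m_0+1}^{m} \log(1 + \eta_{h,k}^2). \tag{4.1}$$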
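The estimate of Corollary 4.1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the Riemann sum for $\int y(s)^{2(\gamma-h)}ds$ is evaluated at the points $t_k$ (evaluating it at $t_{k-1}$ would be an equally natural choice), and the helper `demo_path`, with its CIR-type drift $1 - y$ and positive floor, is hypothetical test data that does not come from the paper.

```python
import math
import random


def sigma_hat(y, delta, h, gamma):
    """Estimate sigma from samples y[0..m] via the discretised form of (3.4)."""
    log_sum = 0.0     # sum of log(1 + eta_{h,k}^2)
    weight_sum = 0.0  # sum of y(t_k)^{2(gamma - h)}
    for prev, cur in zip(y, y[1:]):
        eta = (cur - prev) / prev ** h
        log_sum += math.log1p(eta * eta)
        weight_sum += cur ** (2.0 * (gamma - h))
    return math.sqrt(log_sum / (delta * weight_sum))


def demo_path(n, sigma=0.3, gamma=0.5, seed=1):
    """Hypothetical test data: Euler path of dy = (1 - y)dt + sigma*y^gamma dw."""
    rng = random.Random(seed)
    dt = 1.0 / n
    y = [1.0]
    for _ in range(n):
        p = y[-1]
        step = (1.0 - p) * dt + sigma * p ** gamma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        y.append(max(p + step, 1e-8))  # crude floor keeps the path positive
    return y
```

Note that the drift never enters `sigma_hat`; only the sampled path, the step $\delta$, and the pair $(h, \gamma)$ are used.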
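Estimation of $\gamma$ with excluded $\sigma$

It appears that Theorem 3.1 implies some useful properties of the process $Y_h(t)$ that make it possible to
estimate $\gamma$ in a setting with unknown constant $\sigma$.


Proposition 4.1 For any $h_1, h_2 \in [0,1]$,

$$\frac{\int_\theta^\tau y(s)^{2(\gamma-h_1)}\,ds}{\int_\theta^\tau y(s)^{2(\gamma-h_2)}\,ds} = \frac{\log|Y_{h_1}(\tau)|}{\log|Y_{h_2}(\tau)|}. \tag{4.2}$$

Since calculation of $Y_{h_1}(\tau)$ and $Y_{h_2}(\tau)$ does not require knowledge of the values of $f$, $\gamma$, and $\sigma$,
property (4.2) makes it possible to calculate $\gamma$ as is shown below.


Corollary 4.2 An estimate $\hat\gamma$ of $\gamma$ can be found as a solution of the equation

$$\frac{\sum_{k=m_0+1}^{m} y(t_k)^{2(\hat\gamma-h_1)}}{\sum_{k=m_0+1}^{m} y(t_k)^{2(\hat\gamma-h_2)}} = \frac{\sum_{k=m_0+1}^{m} \log(1 + \eta_{h_1,k}^2)}{\sum_{k=m_0+1}^{m} \log(1 + \eta_{h_2,k}^2)} \tag{4.3}$$

for any pair of pre-selected $h_1$ and $h_2$.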
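A simple way to solve the ratio equation of Corollary 4.2 is a search over a finite grid of candidate values in $[0,1]$. The Python sketch below is illustrative and rests on several assumptions that are not taken from the paper: the sums of powers of $y$ are evaluated at $t_k$, the two sides are compared on a logarithmic scale, and the function name `gamma_hat_ratio` and the default grid size are hypothetical choices.

```python
import math


def gamma_hat_ratio(y, h1, h2, grid_size=300):
    """Grid search for gamma solving the discretised ratio equation (4.3)."""
    prev, cur = y[:-1], y[1:]
    num = sum(math.log1p(((c - p) / p ** h1) ** 2) for p, c in zip(prev, cur))
    den = sum(math.log1p(((c - p) / p ** h2) ** 2) for p, c in zip(prev, cur))
    target = num / den  # right-hand side of (4.3), computed from the data only
    best_g, best_gap = 0.0, float("inf")
    for k in range(grid_size + 1):
        g = k / grid_size
        lhs = (sum(c ** (2.0 * (g - h1)) for c in cur)
               / sum(c ** (2.0 * (g - h2)) for c in cur))
        gap = abs(math.log(lhs) - math.log(target))
        if gap < best_gap:
            best_g, best_gap = g, gap
    return best_g
```

The left-hand side depends on the candidate value only through the variation of $y$ along the path, so a path that stays nearly constant carries little information about $\gamma$; this is consistent with the numerical difficulty reported in Section 6.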


It can be noted that $\sigma$ remains unused and excluded from the analysis for the method described
in Proposition 4.1 and Corollary 4.2; respectively, this method does not lead to an estimate of
$\sigma$.


Estimation of the pair $(\sigma, \gamma)$

Proposition 4.2 The process

$$\frac{1}{t \wedge \tau - \theta}\,\log|Y_\gamma(t \wedge \tau)| \tag{4.4}$$

is a.s. constant in $t \in (\theta, T]$.

Let

$$v_{h,k} = \log(1 + \eta_{h,k}^2), \quad k = m_0 + 1, \ldots, m,$$

and

$$\bar v_h = \frac{1}{m - m_0} \sum_{j=m_0+1}^{m} v_{h,j}.$$

Corollary 4.3 An estimate $\hat\gamma$ of $\gamma$ can be found as the solution of the optimization problem

$$\text{Minimize} \quad \sum_{k=m_0+1}^{m} (v_{h,k} - \bar v_h)^2 \quad \text{over } h \in [0,1]. \tag{4.5}$$

In this case, $\sigma$ can be estimated as

$$\hat\sigma = \sqrt{\bar v_{\hat\gamma}/\delta}, \tag{4.6}$$

where $\hat\gamma$ is the estimate of $\gamma$ obtained as a solution of (4.5).
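The optimization problem (4.5) and the estimate (4.6) can be sketched with a plain grid search, as follows. This Python fragment is an illustration under assumptions: the grid $\{j/N\}_{j=0}^{N}$ and the function name `estimate_gamma_sigma` are hypothetical, and no attempt is made to refine the minimizer between grid points.

```python
import math


def estimate_gamma_sigma(y, delta, grid_size=30):
    """Grid search for (4.5), followed by the estimate of sigma from (4.6)."""
    best = None
    for j in range(grid_size + 1):
        h = j / grid_size
        # v_{h,k} = log(1 + eta_{h,k}^2) for every sampling interval
        v = [math.log1p(((c - p) / p ** h) ** 2) for p, c in zip(y, y[1:])]
        v_bar = sum(v) / len(v)
        spread = sum((vk - v_bar) ** 2 for vk in v)  # criterion in (4.5)
        if best is None or spread < best[0]:
            best = (spread, h, v_bar)
    _, gamma_hat, v_bar = best
    return gamma_hat, math.sqrt(v_bar / delta)
```

The function returns the pair $(\hat\gamma, \hat\sigma)$; the only inputs are the sampled path and the sampling step $\delta$.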

Remark 4.1 Corollary 4.3 allows the following modification for the special case of estimation of $\gamma$
where $\sigma$ is a known constant: an estimate $\hat\gamma$ of $\gamma$ can be found as the solution of
the optimization problem

$$\text{Minimize} \quad \sum_{k=m_0+1}^{m} (v_{h,k}/\delta - \sigma^2)^2 \quad \text{over } h \in [0,1].$$

5 Proofs 


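For $M \in (0, y(\theta))$, let $\tau_M = \tau \wedge \sup\{s \in [\theta,T] : \inf_{q \in [\theta,s]} y(q) > M\}$. Clearly,

$$\tau_M \to \tau \quad \text{as } M \to 0 \quad \text{a.s.} \tag{5.1}$$

Proof of Theorem 3.1. The proof follows the idea of the proof of Lemma 3.2 from
Dokuchaev (2014), where less general log-normal underlying processes were considered. Let
$\tilde a(t) = f(y(\cdot),t)\,y(t)^{-h}$. We have, for any $M \in (0, y(\theta))$,

$$dY_h(t) = iY_h(t)\left[\tilde a(t)\,dt + y(t)^{\gamma-h}\sigma(t)\,dw(t)\right], \quad t \in (\theta, \tau_M).$$

By the Ito formula, for any $M \in (0, y(\theta))$,

$$Y_h(\tau_M) = \exp\left(i\int_\theta^{\tau_M} \tilde a(s)\,ds - \frac{i^2}{2}\int_\theta^{\tau_M} y(s)^{2(\gamma-h)}\sigma(s)^2\,ds + i\int_\theta^{\tau_M} y(s)^{\gamma-h}\sigma(s)\,dw(s)\right)$$
$$= \exp\left(i\int_\theta^{\tau_M} \tilde a(s)\,ds + \frac{1}{2}\int_\theta^{\tau_M} y(s)^{2(\gamma-h)}\sigma(s)^2\,ds + i\int_\theta^{\tau_M} y(s)^{\gamma-h}\sigma(s)\,dw(s)\right).$$

Hence

$$|Y_h(\tau_M)| = \exp\left(\frac{1}{2}\int_\theta^{\tau_M} y(s)^{2(\gamma-h)}\sigma(s)^2\,ds\right) \quad \text{a.s.}$$

and

$$\int_\theta^{\tau_M} y(s)^{2(\gamma-h)}\sigma(s)^2\,ds = 2\log|Y_h(\tau_M)| \quad \text{a.s.}$$

Hence (3.1) follows from (5.1). □

Proof of Corollary 3.1 follows immediately from Theorem 3.1. □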

Proof of Corollary 4.1. Let $t_{m_0} = \theta$ and $t_k = t_{m_0} + (k - m_0)\delta$ for $m_0 < k \le m$, as in Section 4.
Let $\eta_{h,k}$ be as in (4.1). The Euler-Maruyama time discretization of (3.2) leads to the
stochastic difference equation

$$Y_h(t_k) = Y_h(t_{k-1}) + iY_h(t_{k-1})\,\eta_{h,k}, \quad k \ge m_0 + 1, \qquad Y_h(t_{m_0}) = 1.$$

(See, e.g., Kloeden and Platen (1992), Ch. 9.) This equation can be rewritten as

$$Y_h(t_k) = Y_h(t_{k-1})(1 + i\eta_{h,k}), \quad k \ge m_0 + 1, \qquad Y_h(t_{m_0}) = 1.$$

Hence

$$Y_h(t_m) = \prod_{k=m_0+1}^{m} (1 + i\eta_{h,k}).$$

Clearly,

$$|Y_h(t_m)| = \prod_{k=m_0+1}^{m} |1 + i\eta_{h,k}| = \prod_{k=m_0+1}^{m} (1 + \eta_{h,k}^2)^{1/2}$$

and

$$\log|Y_h(t_m)| = \sum_{k=m_0+1}^{m} \log\left[(1 + \eta_{h,k}^2)^{1/2}\right] = \frac{1}{2}\sum_{k=m_0+1}^{m} \log(1 + \eta_{h,k}^2). \tag{5.2}$$

Then (3.3) leads to estimate (4.1). □
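Identity (5.2) means that the discretized $\log|Y_h(t_m)|$ can be computed with real arithmetic only. The short Python sketch below compares the two forms on an arbitrary sequence of increments; the function names are hypothetical and serve only this illustration.

```python
import math


def log_abs_product(etas):
    """log|prod(1 + i*eta_k)|, computed from the complex product itself."""
    Y = complex(1.0, 0.0)
    for eta in etas:
        Y *= complex(1.0, eta)
    return math.log(abs(Y))


def half_log_sum(etas):
    """Right-hand side of (5.2): (1/2) * sum of log(1 + eta_k^2)."""
    return 0.5 * sum(math.log1p(eta * eta) for eta in etas)
```

The two functions agree up to rounding error. The summation form is the more convenient one in practice, since it never forms the complex product explicitly and uses `log1p`, which stays accurate for the small values of $\eta_{h,k}^2$ that arise at high sampling frequencies.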

Proof of Proposition 4.1 and Proposition 4.2 follows immediately from Theorem 3.1 and
Corollary 3.1(b). □

Proof of Corollary 4.2 follows from the natural discretization of integration and (5.2). □
Proof of Corollary 4.3. It follows from (5.2) that the sequence $\{\log|Y_h(t_k)|\}$ represents the
discretization of the continuous-time process $\log|Y_h(t \wedge \tau)|$ at the points $t = t_k$; this process is linear
in time for $h = \gamma$, and $2\log|Y_\gamma(t \wedge \tau)| = (t \wedge \tau - \theta)\sigma^2$. Hence

$$2\log|Y_\gamma(t_{k+1})| - 2\log|Y_\gamma(t_k)| = \delta\sigma^2.$$

On the other hand, (5.2) implies that

$$2\log|Y_h(t_{k+1})| - 2\log|Y_h(t_k)| = v_{h,k}, \quad h \in [0,1].$$

This leads to the optimization problem

$$\text{Minimize} \quad \sum_{k=m_0+1}^{m} (v_{h,k}/\delta - c)^2 \quad \text{over } h \in [0,1],\ c > 0.$$

By the properties of quadratic optimization (for each fixed $h$, the optimal $c$ is the sample mean of $v_{h,k}/\delta$),
this problem can be replaced by the problem

$$\text{Minimize} \quad \sum_{k=m_0+1}^{m} (v_{h,k}/\delta - \bar v_h/\delta)^2 \quad \text{over } h \in [0,1]. \tag{5.3}$$

Then the proof follows. □

Proof of Remark 4.1 repeats the previous proof without optimization over $c$. □

6 Numerical experiments 

To illustrate the numerical implementation of the algorithms described above, we applied these
algorithms to discretized Monte-Carlo simulations of a generalized version of the Cox-Ingersoll-Ross
process (2.2). We consider a toy example of a process with a large number of parameters.
Presumably, estimation of all these parameters by the method of moments or the Maximum
Likelihood method is not feasible due to the high dimension.

We consider a process evolving as follows:

$$dy(t) = H\big(y(t), y(\max(t - \lambda, 0))\big)\,dt + \sigma\,y(t)^{\gamma}\,dw(t), \quad t > 0, \tag{6.1}$$

where

$$H(x,y) = \sum_{k=1}^{N} [F_k(x) + G_k(y)],$$

Fk{x) = ak[bk - + CkCos{dkX + e^), Gk{x) = 0.1dk\pk - 

The integer $N$, the delay $\lambda$, and the coefficients of $F_k$ and $G_k$ are randomly selected in each
experiment. In particular, the integer $N$ is selected randomly from the set $\{1, 2, 3, 4, 5\}$ with equal
probability. The delay parameter $\lambda$ has the uniform distribution on the interval $[0, 0.2]$. The
coefficients of $F_k$ and $G_k$ are uniformly distributed on the interval $[0,1]$.



For the Monte-Carlo simulation, we considered the corresponding discrete-time process $\{y(t_k)\}$
evolving as

$$y(t_{k+1}) = y(t_k) + H\big(y(t_k), y(t_{\max(k-\ell,0)})\big)\,\delta + \sigma\,y(t_k)^{\gamma}\sqrt{\delta}\,\xi_{k+1}, \quad k = 0, \ldots, n, \tag{6.2}$$

with mutually independent random variables $\xi_k$ from the standard normal distribution $N(0,1)$.
Here $\delta = t_{k+1} - t_k = 1/n$; this corresponds to $[\theta,T] = [0,1]$ for the continuous-time underlying model.
The delay $\ell$ is the integer part of $\lambda(T - \theta)/(n + 1)$.
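The simulation step can be sketched in Python as an Euler-Maruyama scheme with a delayed drift argument. This is a minimal illustration under assumptions, not the code used for the experiments: the drift is passed in as a user-supplied function `drift(current, delayed)`, the horizon is fixed to $[0,1]$, and the loop simply stops if a non-positive value is produced. The name `simulate_delay_path` and its parameters are hypothetical.

```python
import math
import random


def simulate_delay_path(y0, drift, sigma, gamma, n, lag, seed=0):
    """Euler scheme for dy = H(y(t), y(t - delay))dt + sigma*y^gamma dw on [0, 1].

    `drift(current, delayed)` plays the role of H and `lag` is the delay
    measured in sampling steps; both are assumptions of this sketch.
    """
    rng = random.Random(seed)
    delta = 1.0 / n
    y = [y0]
    for k in range(n):
        cur = y[k]
        delayed = y[max(k - lag, 0)]
        xi = rng.gauss(0.0, 1.0)  # standard normal innovation
        nxt = (cur + drift(cur, delayed) * delta
               + sigma * cur ** gamma * math.sqrt(delta) * xi)
        if nxt <= 0.0:
            break  # the power y^gamma is only used for positive values
        y.append(nxt)
    return y
```

A call such as `simulate_delay_path(1.0, lambda x, xd: 0.5 * (1.0 - x) + 0.1 * (1.0 - xd), 0.3, 0.5, 250, 10)` would produce one daily-frequency path with a hypothetical mean-reverting drift.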

We considered $n \in \{52, 250, 10000, 20000\}$. For financial applications, the choice
$n = 52$ corresponds to weekly sampling, and the choice $n = 250$ corresponds to daily sampling.

In the Monte-Carlo simulation trials, we considered random $y(t_{m_0})$ uniformly distributed on
$[0.1, 10]$ and truncated paths $y(t_k)|_{m_0 \le k \le m}$, with the Markov stopping time
$m = n \wedge \inf\{k : y(t_k) \le 0.001\,y(t_{m_0})\}$. In this case, $y(t_k) > 0$ for $k = m_0, \ldots, m$. To exclude the possibility
that $y(t_m) \le 0$ (which may happen for our discrete-time process since the values of $\xi_k$ are
unbounded), we replace $y(t_m)$ defined by (6.2) by $y(t_m) = y(t_{m-1}) > 0$ every time the event $m < n$
occurs. It can be noted that, for our choice of parameters, occurrences of the event $m < n$
were very rare and had no impact on the statistics.
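The truncation rule described above can be written as a small helper. The Python sketch below is illustrative only; the function name `truncate_path` and the default level are hypothetical, and the stopping condition is taken as "less than or equal to the threshold", which is an assumption about the exact inequality.

```python
def truncate_path(y, level=0.001):
    """Cut the path at the first index m with y[m] <= level * y[0].

    Following the rule in the text, the value at the stopping index is
    replaced by the previous value, so every retained sample is positive.
    """
    threshold = level * y[0]
    for m in range(1, len(y)):
        if y[m] <= threshold:
            return y[:m] + [y[m - 1]]
    return list(y)
```

For example, `truncate_path([1.0, 0.5, -0.2, 0.7])` returns `[1.0, 0.5, 0.5]`, while a path that never reaches the threshold is returned unchanged.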

We used 10,000 Monte-Carlo trials for each experiment (i.e., for each entry in each of
Tables 6.1-6.3 below). We found that enlarging the sample does not improve the results. In fact,
experiments with 5,000 or even 1,000 Monte-Carlo trials produced the same results.

The parameters of the errors obtained in these experiments are also quite robust with respect to
changes in the other parameters.

We denote by $\mathbb{E}$ the sample mean of the corresponding values over all Monte-Carlo simulation
trials. For the estimates $(\hat\sigma, \hat\gamma)$ of $(\sigma, \gamma)$, we evaluated the root mean-squared errors (RMSE)
$\sqrt{\mathbb{E}|\hat\sigma - \sigma|^2}$ and $\sqrt{\mathbb{E}|\hat\gamma - \gamma|^2}$, the mean errors $\mathbb{E}|\hat\sigma - \sigma|$ and $\mathbb{E}|\hat\gamma - \gamma|$, and the biases $\mathbb{E}(\hat\sigma - \sigma)$
and $\mathbb{E}(\hat\gamma - \gamma)$.
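The three error statistics can be computed from a list of Monte-Carlo estimates as in the following Python sketch; the function name `error_summary` is a hypothetical helper introduced for this illustration.

```python
import math


def error_summary(estimates, true_value):
    """Return (RMSE, mean absolute error, bias) over Monte-Carlo estimates."""
    errors = [e - true_value for e in estimates]
    n = len(errors)
    rmse = math.sqrt(sum(err * err for err in errors) / n)
    mean_abs = sum(abs(err) for err in errors) / n
    bias = sum(errors) / n
    return rmse, mean_abs, bias
```

The bias keeps the sign of the errors, so it can be close to zero even when the RMSE is large; this is the pattern seen for high-frequency sampling in Tables 6.2 and 6.3.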

In the experiments described below, we used $\sigma = 0.3$, $\gamma = 1/2$, and $\gamma = 0.6$.

Estimation of $\sigma$ using Corollary 4.1

The numerical implementation of Corollary 4.1 requires the value of $\gamma$. In other words, one
has to use a certain hypothesis about the value of $\gamma$, based, for instance, on an estimate of $\gamma$ that
was obtained separately. This setting leads to an error caused by miscalculation of $\gamma$.

To illustrate the dependence of the error of the estimate of $\sigma$ on the error in the hypothesis
on $\gamma$, we considered estimates for inputs simulated with $\gamma = 1/2$ and with different $h$.

Tables 6.1(a),(b) show the parameters of the errors in the experiments described above for
estimate (4.1) with different $h$ and with $\gamma = 1/2$, for $\delta = 1/52$ and $\delta = 1/250$ respectively.
The numerical experiments show that these estimates are robust with respect to small errors in $\gamma$;
however, the estimation error for $\sigma$ caused by misidentification of $\gamma$ can be significant.

Estimation of $\gamma$ with unknown $\sigma$ using (4.3) and (4.5)

In these experiments, we used a simulated process with $\gamma = 0.6$ and estimates (4.3) and (4.5).

For the solution of equation (4.3) and optimization problem (4.5), we used a simple search over
a finite set $\{k/N\}$ of grid points in $[0,1]$. We used $N = 300$ for (4.3) and $N = 30$ for (4.5). Further
increase of $N$ does not improve the results but slows down the calculation.

It appears that estimation of $\gamma$ is more numerically challenging than estimation of $\sigma$ using
(4.1) with known $\gamma$. In our experiments, we observed that the value of the criterion
function in (4.5) depends on $h$ smoothly, and that the dependence on $h$ for each particular
Monte-Carlo trial is represented by a U-shaped smooth convex function. However, the minimum point
of this function deviates significantly between different Monte-Carlo trials, especially in the case
of low-frequency sampling. High-frequency sampling is required to reduce the error $\hat\gamma - \gamma$. Table
6.2 shows the parameters of the error $\hat\gamma - \gamma$. We found that these parameters are quite robust
with respect to changes in the other parameters of the simulated process.

Estimation of $\sigma$ using (4.5)

The solution of optimization problem (4.5) gives an estimate of $\sigma$, in addition to an estimate of
$\gamma$, in the setting with unknown $\sigma$, via (4.6). This gives a method for estimation of $\sigma$ that can be
an alternative to estimator (4.1). Table 6.3 shows the parameters of the error $\hat\sigma - \sigma$. It appears
that the RMSE is larger than for estimator (4.1) applied with a correct $h = \gamma$ and has the same
order as the RMSE for this estimator applied with $h \ne \gamma$, i.e., if $\gamma$ is "miscalculated".

(a) $\delta = 1/52$

                               $\sqrt{\mathbb{E}|\hat\sigma-\sigma|^2}$    $\mathbb{E}|\hat\sigma-\sigma|$    $\mathbb{E}(\hat\sigma-\sigma)$
$\gamma = 0.5$, $h = 0.5$      0.0312     0.0248      0.0034
$\gamma = 0.4$, $h = 0.5$      0.0458     0.0365      0.0281
$\gamma = 0.6$, $h = 0.5$      0.0358     0.0290     -0.0183
$\gamma = 0.7$, $h = 0.5$      0.0495     0.0413     -0.0370

(b) $\delta = 1/250$

                               $\sqrt{\mathbb{E}|\hat\sigma-\sigma|^2}$    $\mathbb{E}|\hat\sigma-\sigma|$    $\mathbb{E}(\hat\sigma-\sigma)$
$\gamma = 0.5$, $h = 0.5$      0.0136     0.0109      0.0006
$\gamma = 0.4$, $h = 0.5$      0.0328     0.0272      0.0259
$\gamma = 0.6$, $h = 0.5$      0.0269     0.0227     -0.0215
$\gamma = 0.7$, $h = 0.5$      0.0468     0.0416     -0.0414

Table 6.1: Parameters of the error $\hat\sigma - \sigma$ for $\hat\sigma$ obtained from estimates (4.1) with $\delta = 1/52$
and $\delta = 1/250$. In the first column, $\gamma$ is the "true" power used for simulation, and $h$ is the
parameter of (4.1) used for estimation; mismatching of $\gamma$ and $h$ leads to a larger bias and a
larger estimation error.

Comparison with the performance of other methods

Sørensen (2000) and Zhou (2001) reported the results of testing a variety of estimators based
on the maximum likelihood method or the method of moments for special cases of (2.1). These
works considered simulated processes with a preselected structure for the drift term and a low
dimension of the vector of parameters. Due to numerical challenges for the methods used, the
number of Monte-Carlo trials was relatively small in these works (100 trials in Sørensen (2000)
and 1,000 trials in Zhou (2001)). Sørensen (2000) considered model (2.3) with one fixed set
of parameters $(a, b)$ for the drift, and Zhou (2001) considered model (2.2) for a variety of
parameters $(a, b)$ for the drift. Sørensen (2000) considered estimation of $(\sigma, \gamma)$ and estimation
of the drift parameters, and Zhou (2001) considered estimation of $\sigma$ and estimation of the drift
parameters with fixed $\gamma = 1/2$.

The results for $\sigma$ in Table 5 from Zhou (2001), reported for $\delta = 1/500$, depend significantly
on the choice of the drift parameters $(a, b)$ in (2.2) (in our notation). The minimal RMSE
for estimates of $\sigma$ among all pairs $(a, b)$ is of the same order as the RMSE reported in our Table
6.1(a) for $\delta = 1/250$ for the case of known $h$; for other choices of the drift, the RMSE in Table 5
from Zhou (2001) is much larger. Recall that the RMSE is smaller for smaller $\delta$.

                        $\sqrt{\mathbb{E}|\hat\gamma-\gamma|^2}$   $\mathbb{E}|\hat\gamma-\gamma|$   $\mathbb{E}(\hat\gamma-\gamma)$   $\sqrt{\mathbb{E}|\hat\gamma-\gamma|^2}$   $\mathbb{E}|\hat\gamma-\gamma|$   $\mathbb{E}(\hat\gamma-\gamma)$
                        for (4.3)  for (4.3)  for (4.3)  for (4.5)  for (4.5)  for (4.5)
$\delta = 1/250$        0.2078     0.1736     0.1078     0.2304     0.1946     0.1166
$\delta = 1/10{,}000$   0.0309     0.0182     0.0039     0.0356     0.0221     0.0042
$\delta = 1/20{,}000$   0.0222     0.0109     0.0020     0.0483     0.0294     0.0004

Table 6.2: Parameters of the error $\hat\gamma - \gamma$ for the solution of (4.3) and (4.5) with an unknown $\sigma$.

                        $\sqrt{\mathbb{E}|\hat\sigma-\sigma|^2}$   $\mathbb{E}|\hat\sigma-\sigma|$   $\mathbb{E}(\hat\sigma-\sigma)$
$\delta = 1/250$        0.0515     0.0264     0.0092
$\delta = 1/10{,}000$   0.0063     0.0038     0.0001
$\delta = 1/20{,}000$   0.0168     0.0108     0.00003

Table 6.3: Parameters of the error $\hat\sigma - \sigma$ for $\hat\sigma$ obtained from (4.6) and (4.5) with an unknown
$\gamma$.

The RMSE for $\sigma$ reported in Table II.1 from Sørensen (2000) for $\delta = 1/500$ (in our notation)
is approximately the same as in Table 6.3 for $\delta = 1/250$. However, the RMSE for $\gamma$ with
$\delta = 1/500$ is three times smaller in Table II.1 from Sørensen (2000) for some estimators than
in Table 6.2 with $\delta = 1/250$. It may be, however, that the performance of the estimators
in Table II.1 from Sørensen (2000) is not robust with respect to different choices of the drift
parameters, similarly to the case presented in Table 5 from Zhou (2001) for $\gamma = 1/2$. On the
other hand, our method accommodates a wide variety of drift models with almost unlimited
dimension, and, as we found in some unreported experiments, the choice of a particular drift does
not have an impact on the performance of the estimator.

7 On the consistency of the method 

Let us describe briefly the consistency of the method. Clearly, one cannot assume that real-life data
such as market prices are generated by model (2.1). Hence we restrict our consideration to
the error for simulated data. The continuous-time equations used for our method are
exact and hold almost surely for continuous-time underlying processes (2.1). Therefore, the
only source of error is the time discretization error. This error is inevitable since the
method requires pathwise evaluation of stochastic integrals. Let us discuss briefly the consistency
of the method, understood as convergence of the estimates to the true values as the sampling frequency
increases, i.e. $\delta \to 0$. A rigorous analysis of convergence for the time discretization requires
significant analytical effort outside the scope of this paper; see, e.g., Kloeden and Platen
(1992) and Jourdain and Kohatsu-Higa (2011), where a review of the recent literature can be
found. We leave this analysis for future research and give below a short sketch of two
possible approaches.

There are two options. First, one can consider the Euler-Maruyama time discretization for the
pair $(y, Y_h)$, as described for the numerical experiments above. In this case, $f$
and the sampling frequency $\delta$ have to be such that a satisfactory approximation is achieved.
In particular, by Theorem 9.6.2 from Kloeden and Platen (1992), p. 324, these conditions are
satisfied for CIR models as well as for the case where $f(y(\cdot),t) = f(y(t),t)$. Some analysis of and
conditions for the convergence in more general cases can be found in Jourdain and Kohatsu-Higa
(2011). The numerical experiments described above demonstrate that the required convergence
takes place for the equations with delay modelled there.

Another option is to consider convergence of the method for $\delta \to 0$ given that $Y_h(t_k)$ are
constructed with the "true" entries $y(t)$. We presume here that it is possible to produce an
arbitrarily close approximation of a continuous path $y(t)$ via Monte-Carlo simulation by
increasing the simulation frequency. Let $t_\delta(t)$ be selected as $t_k$ such that $|t - t_k| = \min_p |t - t_p|$
(for certainty, let it be the minimal point if $t$ is in the middle of a sampling interval). Clearly,
$\mathbb{E}\sup_{t \in [\theta,T]} |y(t_\delta(t)) - y(t)|^2 \to 0$ as $\delta \to 0$. Hence
$\int_\theta^\tau y(t_\delta(s))^{2(\gamma-h)}\,ds \to \int_\theta^\tau y(s)^{2(\gamma-h)}\,ds$ in
probability as $\delta \to 0$. Further, it can be shown that $\log|Y_h(t_\delta(t))| \to \log|Y_h(t)|$ in probability as
$\delta \to 0$. This leads to convergence of the estimates to their true values in probability.

8 Discussion 

(i). The estimates listed in Section 4 use neither $f$ nor the probability distribution of the
process $\{y(t)\}$. In particular, they are invariant with respect to the choice of an equivalent
probability measure. This is an attractive feature that makes it possible to consider models with a
large number of parameters for the drift.

(ii). It appears that estimation of the power index $\gamma$ with unknown $\sigma$ is numerically challenging
and requires high-frequency sampling to reduce the error. Perhaps this can be improved
using other modifications of (4.5) and other estimates of the degree of nonlinearity for the
implementation of Proposition 4.2. In particular, the standard criteria of linearity for
first order regressions could be used, and $L_2$-type criteria could be replaced by $L_p$-type
criteria with $p \ne 2$. So far, we have been unable to find a way to reduce the error for lower
sampling frequencies. We leave this for future research.

(iii). Our approach does not cover the estimation of the drift $f$, which is a more challenging
problem. However, the estimates for $(\sigma, \gamma)$ suggested above can be used to simplify
statistical inference for $f$ by reducing the dimension of the numerical problems arising
in the maximum likelihood method, methods of moments, or least squares estimators for
$(f, \sigma, \gamma)$. This can be illustrated as follows.

Assume that $\gamma = 1/2$ is given and that the evolution of $y(t)$ is described by the Cox-Ingersoll-Ross
equation (2.2) with $\theta = 0$. It is known that

$$\mathbb{E}\,y(T) = b(1 - e^{-aT}) + e^{-aT}y(0),$$

$$\mathrm{Var}\,y(T) = \frac{\sigma^2}{2a}\,b\,(1 - e^{-aT})^2 + \frac{\sigma^2}{a}\,(e^{-aT} - e^{-2aT})\,y(0).$$

(See, e.g., Gourieroux and Monfort (2013).) This system can be solved with respect to
$(a, b)$ given that $\mathbb{E}\,y(T)$ and $\mathrm{Var}\,y(T)$ are estimated by their sample values, and $\sigma$ is
estimated as suggested above.

(iv). The paper focuses on the case where $\sigma(t)$ is constant. However, some results can be
extended to the case of time-dependent and random $\sigma(t)$. For example, the proofs given
above imply that $\int_\theta^\tau \sigma(s)^2\,ds$ can be estimated as

$$\sum_{k=m_0+1}^{m} v_{\gamma,k}, \qquad v_{\gamma,k} = \log(1 + \eta_{\gamma,k}^2).$$
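The estimate of the integrated squared diffusion coefficient from item (iv) can be illustrated with a short Python sketch. It is a hedged example only: the time-dependent coefficient $\sigma(t) = 0.2 + 0.2t$, the CIR-type drift $1 - y$, the positive floor, and the names `integrated_variance` and `demo_path` are all assumptions made for this illustration and do not appear in the paper.

```python
import math
import random


def integrated_variance(y, gamma):
    """Sum of log(1 + eta_{gamma,k}^2): estimate of the integral of sigma(s)^2."""
    total = 0.0
    for prev, cur in zip(y, y[1:]):
        eta = (cur - prev) / prev ** gamma
        total += math.log1p(eta * eta)
    return total


def demo_path(n, gamma=0.5, seed=2):
    """Hypothetical data: dy = (1 - y)dt + sigma(t)*y^gamma dw on [0, 1],
    with sigma(t) = 0.2 + 0.2*t."""
    rng = random.Random(seed)
    dt = 1.0 / n
    y = [1.0]
    for k in range(n):
        p = y[-1]
        sig = 0.2 + 0.2 * k * dt
        step = (1.0 - p) * dt + sig * p ** gamma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        y.append(max(p + step, 1e-8))  # crude floor keeps the path positive
    return y
```

For this choice of $\sigma(t)$, the exact value is $\int_0^1 (0.2 + 0.2s)^2\,ds = 0.04 \cdot 7/3 \approx 0.0933$, and `integrated_variance(demo_path(20000), 0.5)` should be close to it for a fine grid. As in the constant case, the power index $\gamma$ must be known or estimated separately.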

Acknowledgments 

The author gratefully acknowledges support provided by ARC grant of Australia DP120100928. 





References 


[1] Ait-Sahalia, Y. (1996). Nonparametric pricing of interest rate derivative securities.
Econometrica 64, 527-560.

[2] Andersen, T.G., Lund, J. (1997). Estimating continuous-time stochastic volatility models
of the short-term interest rate. Journal of Econometrics 77, 343-377.

[3] Carr, P., Sun, J. (2007). A new approach for option pricing under stochastic volatility.
Review of Derivatives Research 10, 87-150.

[4] Chan, K.C., Karolyi, G.A., Longstaff, F.A., Sanders, A.B. (1992). An empirical investigation
of alternative models of the short-term interest rate. Journal of Finance 47, 1209-1227.

[5] De Rossi, G. (2010). Maximum likelihood estimation of the Cox-Ingersoll-Ross model using
particle filters. Computational Economics 36 (1), 1-16.

[6] Dokuchaev, N. (2014). Volatility estimation from short time series of stock prices. Journal
of Nonparametric Statistics 26 (2), 373-384.

[7] Fan, J., Jiang, J., Zhang, C., Zhou, Z. (2003). Time-dependent diffusion models for
term structure dynamics. Statistica Sinica 13, 965-992.

[8] Fergusson, K., Platen, E. (2015). Application of maximum likelihood estimation to
stochastic short rate models. Annals of Financial Economics 10, 1550009 (26 pages).

[9] Gibbons, M.R., Ramaswamy, K. (1993). A test of the Cox, Ingersoll, and Ross model
of the term structure. Review of Financial Studies 6 (3), 619-658.

[10] Gourieroux, C., Monfort, A. (2013). Pitfalls in the estimation of continuous time
interest rate models: the case of the CIR model. Annals of Economics and Statistics
109/110, 25-61.

[11] Heston, S. (1993). A closed-form solution for options with stochastic volatility, with
applications to bond and currency options. Review of Financial Studies 6 (2), 327-343.

[12] Hin, L.-Y., Dokuchaev, N. (2016). Short rate forecasting based on the inference from
the CIR model for multiple yield curve dynamics. Annals of Financial Economics 11,
1650004 (33 pages).

[13] Iacus, S.M. (2008). Simulation and Inference for Stochastic Differential Equations: With R
Examples. Springer.

[14] Kessler, M. (1997). Estimation of an ergodic diffusion from discrete observations.
Scandinavian Journal of Statistics 24, 211-229.

[15] Kloeden, P.E., Platen, E. (1992). Numerical Solution of Stochastic Differential Equations.
Springer-Verlag, Berlin.

[16] Lewis, A.L. (2000). Option Valuation under Stochastic Volatility. Finance Press, Newport
Beach, California, USA.

[17] Luong, C., Dokuchaev, N. (2016). Modelling dependency of volatility on sampling
frequency via delay equations. Annals of Financial Economics 11 (2), 1650007 (21 pages).

[18] Sørensen, H. (2000). Inference for Diffusion Processes and Stochastic Volatility Models. Ph.D.
thesis, University of Copenhagen, Denmark.

[19] Stoica, G. (2005). A stochastic delay financial model. Proceedings of the American
Mathematical Society 133 (6), 1837-1841.

[20] Zhou, H. (2001). Finite sample properties of EMM, GMM, QMLE, and MLE for a square-root
interest rate diffusion model. Journal of Computational Finance 2, 89-122.


